A Spectro-Temporal Demodulation Technique for Pitch Estimation

نویسندگان

  • Jitendra Kumar Dhiman
  • Nagaraj Adiga
  • Chandra Sekhar Seelamantula
چکیده

We consider a two-dimensional demodulation framework for spectro-temporal analysis of the speech signal. We construct narrowband (NB) speech spectrograms, and demodulate them using the Riesz transform, which is a two-dimensional extension of the Hilbert transform. The demodulation results in timefrequency envelope (amplitude modulation or AM) and timefrequency carrier (frequency modulation or FM). The AM corresponds to the vocal tract and is referred to as the vocal tract spectrogram. The FM corresponds to the underlying excitation and is referred to as the carrier spectrogram. The carrier spectrogram exhibits a high degree of time-frequency consistency for voiced sounds. For unvoiced sounds, such a structure is lacking. In addition, the carrier spectrogram reflects the fundamental frequency (F0) variation of the speech signal. We develop a technique to determine the F0 from the carrier spectrogram. The time-frequency consistency is used to determine which time-frequency regions correspond to voiced segments. Comparisons with the state-of-the-art F0 estimation algorithms show that the proposed F0 estimator has high accuracy for telephone channel speech and is robust to noise.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Phoneme Classification Using Temporal Tracking of Speech Clusters in Spectro-temporal Domain

This article presents a new feature extraction technique based on the temporal tracking of clusters in spectro-temporal features space. In the proposed method, auditory cortical outputs were clustered. The attributes of speech clusters were extracted as secondary features. However, the shape and position of speech clusters change during the time. The clusters temporally tracked and temporal tra...

متن کامل

Improved Tonal Language Speech Recognition by Integrating Spectro-Temporal Evidence and Pitch Information with Properly Chosen Tonal Acoustic Units

We propose an improved Tandem system for tonal language speech recognition. Three different types of features, cepstral, spectro-temporal and pitch features, are integrated for modeling tone and phoneme variation simultaneously. Tonal phonemes (or tonemes) are used for MLP posterior estimation, and tonal acoustic units for HMM recognition. In our experiments conducted on Mandarin broadcast news...

متن کامل

Congenital amusia: a cognitive disorder limited to resolved harmonics and with no peripheral basis.

Pitch plays a fundamental role in audition, from speech and music perception to auditory scene analysis. Congenital amusia is a neurogenetic disorder that appears to affect primarily pitch and melody perception. Pitch is normally conveyed by the spectro-temporal fine structure of low harmonics, but some pitch information is available in the temporal envelope produced by the interactions of high...

متن کامل

Spectro-temporal modulation based singing detection combined with pitch-based grouping for singing voice separation

A spectro-temporal modulation based singing voice detection cascaded with a Viterbi based pitch tracking algorithm is proposed in this paper for singing-voice separation from monaural recordings. To detect the singing voice, the spectrotemporal modulation energy related to voice harmonics is extracted using a spectro-temporal modulation analysis framework developed for the Fourier spectrogram. ...

متن کامل

Concurrent speaker localization using multi-band position-pitch (m-popi) algorithm with spectro-temporal pre-processing

Accurate, microphone-based speaker localization in real-world environments, like office spaces or meeting rooms, must be able to track a single speaker and multiple concurrent speakers in the presence of reverberations and background noise. Our Multiband Joint Position-Pitch (M-PoPi) algorithm for circular microphone arrays already shows a frame-wise localization estimation score of about 95% f...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017